genenral notes or findings:

  • All series were found to be I(1), i.e. stationary after first differencing.

SetupΒΆ

InΒ [280]:
# relevent libraries

# install.packages("quantmod")
# install.packages("fredr")
# install.packages("ggfortify")
# install.packages('urca')
# install.packages("tseries")
# install.packages("forecast")
# install.packages("dynlm")
# install.packages("stargazer")
# install.packages("pracma")
# install.packages("dLagM")
# install.packages("gets")
# install.packages("car")
# install.packages("lmtest")
# install.packages("vars")
# install.packages("tseries")
# install.packages("strucchange")
# install.packages("graphics")
# install.packages("grDevices")
# install.packages("tsDyn")
options(warn=-1)
InΒ [281]:
library(xts)
library(zoo)
library(ggplot2)
library(ggfortify)
library(urca)
library(forecast)
InΒ [282]:
data <- read.csv('dataset.csv')

head(data)

tail(data)
A data.frame: 6 Γ— 6
DateEurope.Brent.Spot.Price.FOBMonthly.OECD.petroleum.and.other.liquids.stocksWorld.Industrial.Production.indexOPEC.Production..million.b.d.Baltic.Dry.Index..BADI.
<chr><dbl><dbl><dbl><dbl><int>
1Jan-0025.513768.491NA24.181319
2Feb-0027.783744.295NA24.951531
3Mar-0027.493731.368NA24.911660
4Apr-0022.763770.587NA25.691628
5May-0027.743789.751NA26.251566
6Jun-0029.803820.624NA25.831616
A data.frame: 6 Γ— 6
DateEurope.Brent.Spot.Price.FOBMonthly.OECD.petroleum.and.other.liquids.stocksWorld.Industrial.Production.indexOPEC.Production..million.b.d.Baltic.Dry.Index..BADI.
<chr><dbl><dbl><dbl><dbl><int>
295Jul-2485.154032.39107.2629.041708
296Aug-2480.364047.26107.3828.941814
297Sep-2474.024016.63107.6128.232084
298Oct-2475.633981.54108.1628.711388
299Nov-2474.353979.63108.1628.741354
300Dec-2473.863975.76109.0428.82 997
InΒ [283]:
# convert "Jan-90" to yearmon (monthly format)
d1 <- as.yearmon(data[,1], format = "%b-%y")

# convert to xts
Brent <- xts(data[,2], order.by = d1) # date-indexed series (Crude Oil Prices: Brent - Europe)

# we have no missing values on Brent, but we omit just in case
Brent = na.omit(Brent)              
InΒ [284]:
# we repeat the same process for the other series
Oil_Stocks <- xts(data[,3], order.by = d1)
Oil_Stocks = na.omit(Oil_Stocks)

WIP <- xts(data[,4], order.by = d1)
WIP = na.omit(WIP) # We do not have values of WIP before Jan 2016, all values before are omitted


OPEC <- xts(data[,5], order.by = d1)
OPEC = na.omit(OPEC)

BDI <- xts(data[,6], order.by = d1)
BDI = na.omit(BDI)
InΒ [285]:
head(Brent)
tail(Brent)
          [,1]
Jan 2000 25.51
Feb 2000 27.78
Mar 2000 27.49
Apr 2000 22.76
May 2000 27.74
Jun 2000 29.80
          [,1]
Jul 2024 85.15
Aug 2024 80.36
Sep 2024 74.02
Oct 2024 75.63
Nov 2024 74.35
Dec 2024 73.86

Data goes from Jan 2000 to Dec 2024

InΒ [286]:
options(repr.plot.width=30, repr.plot.height=20)

fig = autoplot(Brent, size = .8, colour = 'black')
fig = fig +
theme_light() +
theme(plot.margin = ggplot2::margin(1, 1, 1, 1, "cm")) +
theme(text=element_text(size=30)) +
labs(x = "Month -  Year") +
labs(y = "Europe Brent Spot Price FOB (Dollars per Barrel)")
fig
No description has been provided for this image
InΒ [287]:
options(repr.plot.width=30, repr.plot.height=20)

fig = autoplot(Oil_Stocks, size = .8, colour = 'black')
fig = fig +
theme_light() +
theme(plot.margin = ggplot2::margin(1, 1, 1, 1, "cm")) +
theme(text=element_text(size=30)) +
labs(x = "Month -  Year") +
labs(y = "Monthly OECD petroleum and other liquids stocks - million barrels")
fig
No description has been provided for this image
InΒ [288]:
options(repr.plot.width=30, repr.plot.height=20)

fig = autoplot(WIP, size = .8, colour = 'black')
fig = fig +
theme_light() +
theme(plot.margin = ggplot2::margin(1, 1, 1, 1, "cm")) +
theme(text=element_text(size=30)) +
labs(x = "Month -  Year") +
labs(y = "World Industrial Production index")
fig
No description has been provided for this image
InΒ [289]:
options(repr.plot.width=30, repr.plot.height=20)

fig = autoplot(OPEC, size = .8, colour = 'black')
fig = fig +
theme_light() +
theme(plot.margin = ggplot2::margin(1, 1, 1, 1, "cm")) +
theme(text=element_text(size=30)) +
labs(x = "Month -  Year") +
labs(y = "OPEC Production (million b/d)")
fig
No description has been provided for this image
InΒ [290]:
options(repr.plot.width=30, repr.plot.height=20)

fig = autoplot(BDI, size = .8, colour = 'black')
fig = fig +
theme_light() +
theme(plot.margin = ggplot2::margin(1, 1, 1, 1, "cm")) +
theme(text=element_text(size=30)) +
labs(x = "Month -  Year") +
labs(y = "Baltic Dry Index (BADI)")
fig
No description has been provided for this image
InΒ [291]:
all_series <- merge(Brent, Oil_Stocks, WIP, OPEC, BDI)
colnames(all_series) <- c("Brent", "Oil Stocks", "World Industrial Production", "OPEC Production", "Baltic Dry Index")

df_long <- fortify(all_series, melt = TRUE)
colnames(df_long) <- c("Date", "Series", "Value")

options(repr.plot.width = 30, repr.plot.height = 20)

fig <- ggplot(df_long, aes(x = Date, y = Value, colour = Series)) +
  geom_line(size = 1) +
  theme_light() +
  theme(
    # aspect.ratio = 1,
    plot.margin = ggplot2::margin(0.2, 0.2, 0.2, 0.2, "cm"),
    text = element_text(size = 30),
    legend.position = c(0.15, 0.85),
    legend.background = element_rect(fill = alpha("white", 0.6)),
    legend.key = element_rect(fill = NA)
  ) +
  labs(
    x = "Month - Year",
    y = "Value",
    colour = "Series"
  )

fig
No description has been provided for this image

ADF Testing of Brent Oil PricesΒΆ

InΒ [292]:
summary(ur.df(Brent, type='trend', lags=30, selectlags="AIC"))
############################################### 
# Augmented Dickey-Fuller Test Unit Root Test # 
############################################### 

Test regression trend 


Call:
lm(formula = z.diff ~ z.lag.1 + 1 + tt + z.diff.lag)

Residuals:
     Min       1Q   Median       3Q      Max 
-21.8417  -3.9894   0.6469   3.7461  17.4151 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept)  2.902228   1.170712   2.479  0.01380 *  
z.lag.1     -0.041953   0.014119  -2.971  0.00324 ** 
tt           0.001226   0.004833   0.254  0.80003    
z.diff.lag   0.334422   0.057681   5.798 1.91e-08 ***
---
Signif. codes:  0 β€˜***’ 0.001 β€˜**’ 0.01 β€˜*’ 0.05 β€˜.’ 0.1 β€˜ ’ 1

Residual standard error: 5.948 on 265 degrees of freedom
Multiple R-squared:  0.1304,	Adjusted R-squared:  0.1206 
F-statistic: 13.25 on 3 and 265 DF,  p-value: 4.395e-08


Value of test-statistic is: -2.9714 3.0686 4.5485 

Critical values for test statistics: 
      1pct  5pct 10pct
tau3 -3.98 -3.42 -3.13
phi2  6.15  4.71  4.05
phi3  8.34  6.30  5.36

We cannot reject the null hypothesis that the constant and deterministic trend are jointly zero because $\varphi_2=3.07<4.71$.

optional: (We also cannot reject that the deterministic trend is zero because $\varphi_3=4.55<6.30$.)

Therefore we re-run the ADF test with specification = "none" (no constant, no determistic trend).

InΒ [293]:
summary(ur.df(Brent, type='none', lags=30, selectlags="AIC"))
############################################### 
# Augmented Dickey-Fuller Test Unit Root Test # 
############################################### 

Test regression none 


Call:
lm(formula = z.diff ~ z.lag.1 - 1 + z.diff.lag)

Residuals:
    Min      1Q  Median      3Q     Max 
-20.894  -3.198   1.357   4.168  17.064 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
z.lag.1    -0.003547   0.004836  -0.733    0.464    
z.diff.lag  0.320206   0.058071   5.514 8.27e-08 ***
---
Signif. codes:  0 β€˜***’ 0.001 β€˜**’ 0.01 β€˜*’ 0.05 β€˜.’ 0.1 β€˜ ’ 1

Residual standard error: 6.022 on 267 degrees of freedom
Multiple R-squared:  0.1027,	Adjusted R-squared:  0.09599 
F-statistic: 15.28 on 2 and 267 DF,  p-value: 5.204e-07


Value of test-statistic is: -0.7334 

Critical values for test statistics: 
      1pct  5pct 10pct
tau1 -2.58 -1.95 -1.62

We cannot reject the null that there exists a unit root at the 5pct significance level since $\tau_1=-0.7334>-1.095"$.

Therefore we difference the series. Chatgpt why we also take the log:

We log the Brent crude oil price series because crude oil prices exhibit strong multiplicative volatility: they move in percentages rather than fixed amounts, and large level increases create proportionally larger fluctuations. Taking the logarithm stabilizes the variance, making the series more homoskedastic and better aligned with the linear assumptions of ARIMA and ADL models. Log-transforming also converts differences into approximate percentage changes, so the series becomes interpretable as monthly returns, which are typically stationary even when price levels are not. This improves model performance, reduces the influence of extreme outliers, and produces more reliable forecasts compared with using raw price levels.

  1. Oil prices are strictly positive and highly volatile β†’ logs stabilize variance

Brent has:

multiplicative shocks (prices jump by percentages, not by fixed amounts)

strong heteroskedasticity

explosive episodes (2008, COVID crash, 2022 spike)

Taking logs makes the series behave more like a linear, additive process, which is exactly what ARIMA/ADL/VAR models assume.

Without logging:

large spikes dominate the model

residuals become heteroskedastic

parameters can be unstable

forecasts become scale-dependent

This is all bad.

InΒ [294]:
Brent_log <- log(Brent)
dBrent_log = diff(Brent_log)

# diff() always produces an NA in the first observation (there is no lagged value for t=1).
dBrent_log  <- na.omit(dBrent_log)
InΒ [295]:
options(repr.plot.width=30, repr.plot.height=20)

fig = autoplot(dBrent_log, size = .8, colour = 'black')
fig = fig +
theme_light() +
theme(plot.margin = ggplot2::margin(1, 1, 1, 1, "cm")) +
theme(text=element_text(size=30)) +
labs(x = "Month -  Year") +
labs(y = "Transformed Europe Brent Spot Price FOB (Dollars per Barrel)")
fig
No description has been provided for this image
InΒ [296]:
summary(ur.df(dBrent_log, type='none', lags=30, selectlags="AIC"))
############################################### 
# Augmented Dickey-Fuller Test Unit Root Test # 
############################################### 

Test regression none 


Call:
lm(formula = z.diff ~ z.lag.1 - 1 + z.diff.lag)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.49795 -0.05073  0.01671  0.06067  0.54964 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
z.lag.1     -0.96292    0.13098  -7.352  2.5e-12 ***
z.diff.lag1  0.26852    0.11993   2.239   0.0260 *  
z.diff.lag2  0.13147    0.10510   1.251   0.2121    
z.diff.lag3  0.08946    0.09048   0.989   0.3237    
z.diff.lag4 -0.01551    0.07460  -0.208   0.8355    
z.diff.lag5  0.11071    0.06115   1.810   0.0714 .  
---
Signif. codes:  0 β€˜***’ 0.001 β€˜**’ 0.01 β€˜*’ 0.05 β€˜.’ 0.1 β€˜ ’ 1

Residual standard error: 0.0995 on 262 degrees of freedom
Multiple R-squared:  0.3991,	Adjusted R-squared:  0.3854 
F-statistic: 29.01 on 6 and 262 DF,  p-value: < 2.2e-16


Value of test-statistic is: -7.3519 

Critical values for test statistics: 
      1pct  5pct 10pct
tau1 -2.58 -1.95 -1.62

We reject the null of a unit root, meaning that the series is stationary. I.e. there is enough evidence to conclude that the series is stationary because $\tau_1=-7.35<-1.95$.

ARIMA Subset Search with AIC/BICΒΆ

InΒ [297]:
p = 4       # Maximum AR order
q = 4       # Maximum MA order
pq = (p+1) * (q+1)   # Total number of (p,q) combinations including 0
resAIC = rep(0, pq)  # Storage for AIC values
resBIC = rep(0, pq)  # Storage for BIC values
pdf = rep(0, pq)     # Storage for AR order (p)
qdf = rep(0, pq)     # Storage for MA order (q)

# if we condisider subarima models, where coefficients can be or not be, then we jump to 2^(p+q) models. but this function does not consider this. the next section does.

k = 0  # Counter for results
for (i in 0:p) {
    for (j in 0:q) {
        k = k + 1
        # Estimate ARMA(p,q) model for stationary series dy
        model = arima(dBrent_log, order = c(i,0,j), include.mean = FALSE)
        
        # Store information criteria
        resAIC[k] = AIC(model)
        resBIC[k] = BIC(model)
        
        # Store corresponding lag orders
        pdf[k] = i
        qdf[k] = j
    }
}

# Combine and display results: p, q, AIC, and BIC
results = cbind(pdf, qdf, data.frame(resAIC), data.frame(resBIC))
results 

# Find the model with the lowest AIC
bestAIC = results[which.min(results$resAIC), ]   # Row with min AIC

# Find the model with the lowest BIC
bestBIC = results[which.min(results$resBIC), ]   # Row with min BIC

# Display them
print("According to the AIC information criterion, the best model is")
bestAIC

print("According to the BIC information criterion, the best model is")
bestBIC
A data.frame: 25 Γ— 4
pdfqdfresAICresBIC
<dbl><dbl><dbl><dbl>
00-501.7018-498.0013
01-520.8478-513.4469
02-520.4051-509.3038
03-518.5640-503.7622
04-521.7599-503.2576
10-516.5980-509.1971
11-520.5651-509.4637
12-521.0218-506.2201
13-519.1545-500.6523
14-520.5130-498.3103
20-520.7810-509.6797
21-519.8568-505.0550
22-519.2473-500.7450
23-520.8788-498.6762
24-518.5790-492.6759
30-519.1719-504.3701
31-520.7860-502.2838
32-520.6640-498.4614
33-516.4526-490.5495
34-531.3006-501.6970
40-519.1070-500.6047
41-520.1893-497.9866
42-531.5167-505.6136
43-528.8733-499.2698
44-527.6806-494.3767
[1] "According to the AIC information criterion, the best model is"
A data.frame: 1 Γ— 4
pdfqdfresAICresBIC
<dbl><dbl><dbl><dbl>
2342-531.5167-505.6136
[1] "According to the BIC information criterion, the best model is"
A data.frame: 1 Γ— 4
pdfqdfresAICresBIC
<dbl><dbl><dbl><dbl>
201-520.8478-513.4469
InΒ [298]:
fit_uni <- auto.arima(dBrent_log, 
    max.d = 0,    # no differencing allowed (series already made stationary)
    max.p = 4,    # maximum AR order (p)
    max.q = 4,    # maximum MA order (q)
    max.D = 0,    # no seasonal differencing
    max.P = 0,    # no seasonal AR terms
    max.Q = 0,    # no seasonal MA terms
    ic = "aic",   # choose the model that minimizes AIC
    stepwise = FALSE,     # search over all models (not just stepwise path)
    approximation = FALSE # use exact maximum likelihood (slower but more accurate)
)
summary(fit_uni)
Series: dBrent_log 
ARIMA(0,0,4) with zero mean 

Coefficients:
         ma1      ma2      ma3      ma4
      0.2754  -0.0875  -0.0653  -0.1317
s.e.  0.0579   0.0591   0.0631   0.0576

sigma^2 = 0.01002:  log likelihood = 265.88
AIC=-521.76   AICc=-521.56   BIC=-503.26

Training set error measures:
                      ME       RMSE        MAE      MPE    MAPE     MASE
Training set 0.003633379 0.09941295 0.07207996 94.73877 169.993 0.692788
                     ACF1
Training set -0.005245346

We evaluated autoregressive moving average (ARMA) models with up to 4 lags in both the autoregressive and moving average terms. This meant that a total of 25 models (all combinations of p = 0,…,4 and q = 0,…,4) were estimated. The choice of 4 lags reflects a balance between allowing enough dynamics to capture quarterly fluctuations in GDP and keeping the model space manageable for interpretation.

The models were compared using the Akaike Information Criterion (AIC), and the search considered all possible specifications within this range (an exact search rather than a stepwise approximation). Likelihood estimation was carried out without approximation. Based on this procedure, the ARMA(0,0) model, which corresponds to white noise, was selected as the best fitting model.

INCLUDE THIS:

An exhaustive ARMA(p,q) search with AIC favored a higher-order ARMA(4,2) model, which is common when using AIC because its penalty for additional parameters is relatively mild; however, auto.arima, which uses AICc and enforces stationarity/invertibility constraints, selected a more parsimonious ARMA(0,4) model. We therefore use the auto.arima model as the benchmark, consistent with forecasting practice and the course methodology.

Finally, we decided to let auto.arima find the best model without restrictions so we ran:

InΒ [299]:
fit_uni <- auto.arima(dBrent_log, 
    max.d = 0,    # no differencing allowed (series already made stationary)
    # max.p = 4,    # maximum AR order (p)
    # max.q = 4,    # maximum MA order (q)
    # max.D = 0,    # no seasonal differencing
    # max.P = 0,    # no seasonal AR terms
    # max.Q = 0,    # no seasonal MA terms
    ic = "aic",   # choose the model that minimizes AIC
    stepwise = FALSE,     # search over all models (not just stepwise path)
    approximation = FALSE # use exact maximum likelihood (slower but more accurate)
)
summary(fit_uni)
Series: dBrent_log 
ARIMA(0,0,4) with zero mean 

Coefficients:
         ma1      ma2      ma3      ma4
      0.2754  -0.0875  -0.0653  -0.1317
s.e.  0.0579   0.0591   0.0631   0.0576

sigma^2 = 0.01002:  log likelihood = 265.88
AIC=-521.76   AICc=-521.56   BIC=-503.26

Training set error measures:
                      ME       RMSE        MAE      MPE    MAPE     MASE
Training set 0.003633379 0.09941295 0.07207996 94.73877 169.993 0.692788
                     ACF1
Training set -0.005245346

Interpretation of the Univariate Benchmark ModelΒΆ

Using exact maximum likelihood (approximation = FALSE), auto.arima selects an $$ \text{ARIMA}(0,0,4) $$ model for the stationary series of log-differenced Brent prices,
$$ dBrent\_{log,t} = \Delta \log(Brent_t), $$ with zero mean. Because the series has already been differenced and is centered around zero, the model contains only MA terms, capturing short-run, shock-driven dynamics typical of commodity returns.

The estimated model is:

$$ dBrent\_{log,t} = \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} + \theta_3 \varepsilon_{t-3} + \theta_4 \varepsilon_{t-4} + \varepsilon_t, $$

with the following coefficients:

  • $\theta_1 = 0.2754$ (significant, positive short-run persistence)
  • $\theta_2 = -0.0875$
  • $\theta_3 = -0.0653$
  • $\theta_4 = -0.1317$

These MA terms reflect how past shocks (news surprises) propagate through monthly oil returns. The absence of AR terms indicates that the series does not exhibit meaningful autoregressive structure at monthly frequency once log-differenced.

The fit is strong:

  • Residual variance: $$\sigma^2 \approx 0.0100$$
  • AIC and AICc are very low (βˆ’521.76 and βˆ’521.56), indicating good in-sample fit
  • BIC = βˆ’503.26 also supports the model’s parsimony
  • The first residual autocorrelation is essentially zero (ACF1 β‰ˆ βˆ’0.005), showing that the MA(4) structure captures all meaningful time dependence.

Overall, this ARIMA(0,0,4) model is a clean and parsimonious univariate benchmark: it treats monthly Brent log-returns as a short-memory process driven by the last few shocks, with no trend, drift, or autoregressive components. This serves as the baseline against which multivariate models (ADL or VAR with macroeconomic, financial, and supply-side predictors) will later be compared.

This ARIMA model forms our univariate benchmark, against which we later compare multivariate models incorporating macroeconomic, financial, and supply-side predictors.

InΒ [300]:
# dBrent_log is your xts series of log-returns
dates <- index(dBrent_log)

# Fitted values from ARIMA, coerced to numeric and put on same dates
fitted_xts <- xts(as.numeric(fitted(fit_uni)), order.by = dates)

# Combine actual and fitted into one xts object
plot_xts <- merge(Actual = dBrent_log, Fitted = fitted_xts)

# Build data frame for ggplot
plot_df <- data.frame(
  Date   = index(plot_xts),
  Actual = as.numeric(plot_xts$Actual),
  Fitted = as.numeric(plot_xts$Fitted)
)

options(repr.plot.width = 20, repr.plot.height = 10)

fig <- ggplot(plot_df, aes(x = Date)) +
  geom_line(aes(y = Actual, colour = "Actual"), linewidth = 0.7) +
  geom_line(aes(y = Fitted, colour = "Fitted"), linewidth = 0.6) +
  scale_colour_manual(values = c("Actual" = "black", "Fitted" = "red")) +
  theme_light() +
  theme(
    plot.margin = ggplot2::margin(1, 1, 1, 1, "cm"),
    text = element_text(size = 25),
    legend.title = element_blank()
  ) +
  labs(
    title = "ARIMA(0,0,4) – Actual vs Fitted dBrent_log",
    x = "Month - Year",
    y = "dBrent_log"
  )

fig
No description has been provided for this image
InΒ [301]:
dim(dBrent_log)
dim(fitted_xts)

head(dBrent_log)
head(fitted_xts)
  1. 299
  2. 1
  1. 299
  2. 1
                [,1]
Feb 2000  0.08524581
Mar 2000 -0.01049404
Apr 2000 -0.18881769
May 2000  0.19787081
Jun 2000  0.07163298
Jul 2000 -0.03830838
                 [,1]
Feb 2000  0.004154238
Mar 2000  0.019868432
Apr 2000 -0.017301407
May 2000 -0.050741019
Jun 2000  0.073614611
Jul 2000 -0.006555693

In an MA(4) model, the fitted value at time t depends on the previous four shocks, which are not directly observable at the start of the sample. However, auto.arima estimates these initial shocks using maximum likelihood and backcasting, allowing the model to produce fitted values for the entire sample without introducing missing observations. This is standard in ARIMA implementations: initial states are treated as parameters and inferred jointly with the MA coefficients. As a result, fitted(fit_uni) contains valid values from the first period onward, even though the early fitted points rely more heavily on the estimated pre-sample shocks.

Although the ARIMA model on log-returns cannot reproduce historical price levels, its forecasts remain fully meaningful: predicted log-returns can be accumulated and exponentiated to yield price forecasts. This is standard practice in financial econometrics, where stationary return models are used to generate price forecasts through the transformation (P_{t+h} = P_t \exp(\sum_{i=1}^h \widehat{d\log P}_{t+i})).

InΒ [302]:
tail(Brent)

# price forecasts
fc <- forecast(fit_uni, h = 4)

last_price <- as.numeric(last(Brent))

fc_prices <- last_price * exp(cumsum(fc$mean))
fc_prices
          [,1]
Jul 2024 85.15
Aug 2024 80.36
Sep 2024 74.02
Oct 2024 75.63
Nov 2024 74.35
Dec 2024 73.86
  1. 74.5082607531298
  2. 74.2970033763458
  3. 74.6859194252936
  4. 74.7389284720804

ADF Testing of RegressorsΒΆ

We test the stationarity of the regressors and apply required transformations to them.

InΒ [303]:
# Oil_Stocks, WIP, OPEC, BDI

Monthly OECD petroleum and other liquids stocksΒΆ

InΒ [305]:
summary(ur.df(Oil_Stocks, type='trend', lags=30, selectlags="AIC"))
############################################### 
# Augmented Dickey-Fuller Test Unit Root Test # 
############################################### 

Test regression trend 


Call:
lm(formula = z.diff ~ z.lag.1 + 1 + tt + z.diff.lag)

Residuals:
   Min     1Q Median     3Q    Max 
-83.42 -15.80   0.74  17.53 133.90 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept)  98.438477  43.394678   2.268 0.024145 *  
z.lag.1      -0.023730   0.010793  -2.199 0.028808 *  
tt            0.011855   0.028938   0.410 0.682396    
z.diff.lag1   0.157975   0.061095   2.586 0.010277 *  
z.diff.lag2   0.149594   0.057928   2.582 0.010374 *  
z.diff.lag3   0.202172   0.058366   3.464 0.000625 ***
z.diff.lag4  -0.037399   0.059724  -0.626 0.531747    
z.diff.lag5  -0.076052   0.059790  -1.272 0.204543    
z.diff.lag6   0.008252   0.059923   0.138 0.890579    
z.diff.lag7  -0.103241   0.059110  -1.747 0.081921 .  
z.diff.lag8   0.067807   0.059440   1.141 0.255045    
z.diff.lag9   0.034403   0.059762   0.576 0.565345    
z.diff.lag10  0.064062   0.059674   1.074 0.284055    
z.diff.lag11 -0.037634   0.058673  -0.641 0.521830    
z.diff.lag12  0.381223   0.058390   6.529  3.6e-10 ***
z.diff.lag13 -0.156074   0.062761  -2.487 0.013535 *  
---
Signif. codes:  0 β€˜***’ 0.001 β€˜**’ 0.01 β€˜*’ 0.05 β€˜.’ 0.1 β€˜ ’ 1

Residual standard error: 30.26 on 253 degrees of freedom
Multiple R-squared:  0.2983,	Adjusted R-squared:  0.2567 
F-statistic: 7.169 on 15 and 253 DF,  p-value: 4.498e-13


Value of test-statistic is: -2.1987 1.8629 2.7943 

Critical values for test statistics: 
      1pct  5pct 10pct
tau3 -3.98 -3.42 -3.13
phi2  6.15  4.71  4.05
phi3  8.34  6.30  5.36

We cannot reject the null hypothesis that the constant and deterministic trend are jointly zero because $\varphi_2=1.86<4.71$.

optional: (We also cannot reject that the deterministic trend is zero because $\varphi_3=2.79<6.30$.)

Therefore we re-run the ADF test with specification = "none" (no constant, no determistic trend).

InΒ [308]:
summary(ur.df(Oil_Stocks, type='none', lags=30, selectlags="AIC"))
############################################### 
# Augmented Dickey-Fuller Test Unit Root Test # 
############################################### 

Test regression none 


Call:
lm(formula = z.diff ~ z.lag.1 - 1 + z.diff.lag)

Residuals:
   Min     1Q Median     3Q    Max 
-77.57 -17.42   1.55  16.02 128.44 

Coefficients:
               Estimate Std. Error t value Pr(>|t|)    
z.lag.1      -0.0000493  0.0004387  -0.112  0.91061    
z.diff.lag1   0.1592557  0.0614759   2.591  0.01013 *  
z.diff.lag2   0.1391212  0.0578646   2.404  0.01692 *  
z.diff.lag3   0.1915701  0.0583661   3.282  0.00117 ** 
z.diff.lag4  -0.0527497  0.0595435  -0.886  0.37650    
z.diff.lag5  -0.0920165  0.0595558  -1.545  0.12358    
z.diff.lag6  -0.0069859  0.0597194  -0.117  0.90697    
z.diff.lag7  -0.1144304  0.0591365  -1.935  0.05409 .  
z.diff.lag8   0.0586134  0.0595379   0.984  0.32582    
z.diff.lag9   0.0238189  0.0598039   0.398  0.69075    
z.diff.lag10  0.0541752  0.0597580   0.907  0.36549    
z.diff.lag11 -0.0516730  0.0584254  -0.884  0.37730    
z.diff.lag12  0.3651596  0.0579236   6.304 1.27e-09 ***
z.diff.lag13 -0.1833568  0.0615107  -2.981  0.00315 ** 
---
Signif. codes:  0 β€˜***’ 0.001 β€˜**’ 0.01 β€˜*’ 0.05 β€˜.’ 0.1 β€˜ ’ 1

Residual standard error: 30.47 on 255 degrees of freedom
Multiple R-squared:  0.2828,	Adjusted R-squared:  0.2434 
F-statistic: 7.182 on 14 and 255 DF,  p-value: 1.592e-12


Value of test-statistic is: -0.1124 

Critical values for test statistics: 
      1pct  5pct 10pct
tau1 -2.58 -1.95 -1.62

We cannot reject the null that there exists a unit root at the 5pct significance level since $\tau_1=-0.11>-1.95$.

Therefore we difference the series and take the log as the series has a clear upward trend and its variance grows with the level. Log-differencing stabilizes the variance and turns level changes into approximate percentage changes, making the transformed series much closer to weakly stationary. If you only difference levels, you often remove the trend but leave heteroskedasticity and scale effects, which leads to poorer ARIMA/VAR performance.

InΒ [309]:
Oil_Stocks_log <- log(Oil_Stocks)
dOil_Stocks_log = diff(Oil_Stocks_log)

# diff() always produces an NA in the first observation (there is no lagged value for t=1).
dOil_Stocks_log  <- na.omit(dOil_Stocks_log)
InΒ [311]:
summary(ur.df(dOil_Stocks_log, type='none', lags=30, selectlags="AIC"))
############################################### 
# Augmented Dickey-Fuller Test Unit Root Test # 
############################################### 

Test regression none 


Call:
lm(formula = z.diff ~ z.lag.1 - 1 + z.diff.lag)

Residuals:
       Min         1Q     Median         3Q        Max 
-0.0196585 -0.0039973  0.0002265  0.0037479  0.0282856 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
z.lag.1      -0.518077   0.151567  -3.418 0.000734 ***
z.diff.lag1  -0.344925   0.152969  -2.255 0.024990 *  
z.diff.lag2  -0.203051   0.147825  -1.374 0.170777    
z.diff.lag3  -0.008394   0.142398  -0.059 0.953042    
z.diff.lag4  -0.063899   0.135330  -0.472 0.637207    
z.diff.lag5  -0.157040   0.127390  -1.233 0.218806    
z.diff.lag6  -0.166712   0.117649  -1.417 0.157696    
z.diff.lag7  -0.280178   0.110117  -2.544 0.011538 *  
z.diff.lag8  -0.228228   0.103512  -2.205 0.028358 *  
z.diff.lag9  -0.205143   0.098066  -2.092 0.037439 *  
z.diff.lag10 -0.144993   0.092872  -1.561 0.119712    
z.diff.lag11 -0.189079   0.081131  -2.331 0.020557 *  
z.diff.lag12  0.172815   0.061580   2.806 0.005397 ** 
---
Signif. codes:  0 β€˜***’ 0.001 β€˜**’ 0.01 β€˜*’ 0.05 β€˜.’ 0.1 β€˜ ’ 1

Residual standard error: 0.007212 on 255 degrees of freedom
Multiple R-squared:  0.6023,	Adjusted R-squared:  0.5821 
F-statistic: 29.71 on 13 and 255 DF,  p-value: < 2.2e-16


Value of test-statistic is: -3.4181 

Critical values for test statistics: 
      1pct  5pct 10pct
tau1 -2.58 -1.95 -1.62

The transformed series is now stationary since we reject the null hypothesis of the presence of a unit root because $\tau_1=-3.42<-2.58$.

InΒ [336]:
options(repr.plot.width=30, repr.plot.height=20)

fig = autoplot(dOil_Stocks_log, size = .8, colour = 'black')
fig = fig +
theme_light() +
theme(plot.margin = ggplot2::margin(1, 1, 1, 1, "cm")) +
theme(text=element_text(size=30)) +
labs(x = "Month -  Year") +
labs(y = "Transformed Monthly OECD petroleum and other liquids stocks")
fig
No description has been provided for this image

World Industrial Production index (WIP)ΒΆ

InΒ [312]:
summary(ur.df(WIP, type='trend', lags=30, selectlags="AIC"))
############################################### 
# Augmented Dickey-Fuller Test Unit Root Test # 
############################################### 

Test regression trend 


Call:
lm(formula = z.diff ~ z.lag.1 + 1 + tt + z.diff.lag)

Residuals:
    Min      1Q  Median      3Q     Max 
-8.9913 -0.2618  0.0944  0.5366  4.9502 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 15.85439    4.75024   3.338 0.001332 ** 
z.lag.1     -0.18306    0.05406  -3.386 0.001144 ** 
tt           0.03832    0.01305   2.938 0.004425 ** 
z.diff.lag   0.39480    0.10670   3.700 0.000415 ***
---
Signif. codes:  0 β€˜***’ 0.001 β€˜**’ 0.01 β€˜*’ 0.05 β€˜.’ 0.1 β€˜ ’ 1

Residual standard error: 1.549 on 73 degrees of freedom
Multiple R-squared:  0.2204,	Adjusted R-squared:  0.1884 
F-statistic: 6.881 on 3 and 73 DF,  p-value: 0.0003813


Value of test-statistic is: -3.3862 4.0205 5.8195 

Critical values for test statistics: 
      1pct  5pct 10pct
tau3 -3.99 -3.43 -3.13
phi2  6.22  4.75  4.07
phi3  8.43  6.49  5.47

We cannot reject the null hypothesis that the constant and deterministic trend are jointly zero because $\varphi_2=4.02<4.75$.

optional: (We also cannot reject that the deterministic trend is zero because $\varphi_3=5.82<6.49$.)

Therefore we re-run the ADF test with specification = "none" (no constant, no determistic trend).

InΒ [313]:
summary(ur.df(WIP, type='none', lags=30, selectlags="AIC"))
############################################### 
# Augmented Dickey-Fuller Test Unit Root Test # 
############################################### 

Test regression none 


Call:
lm(formula = z.diff ~ z.lag.1 - 1 + z.diff.lag)

Residuals:
    Min      1Q  Median      3Q     Max 
-8.0803 -0.3883  0.0152  0.5767  6.8035 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)   
z.lag.1      0.001254   0.001849   0.678  0.49993   
z.diff.lag1  0.376357   0.113882   3.305  0.00147 **
z.diff.lag2 -0.212485   0.113944  -1.865  0.06617 . 
---
Signif. codes:  0 β€˜***’ 0.001 β€˜**’ 0.01 β€˜*’ 0.05 β€˜.’ 0.1 β€˜ ’ 1

Residual standard error: 1.62 on 74 degrees of freedom
Multiple R-squared:  0.1433,	Adjusted R-squared:  0.1085 
F-statistic: 4.125 on 3 and 74 DF,  p-value: 0.009227


Value of test-statistic is: 0.6779 

Critical values for test statistics: 
      1pct  5pct 10pct
tau1 -2.58 -1.95 -1.62

We cannot reject the null that there exists a unit root at the 5pct significance level since $\tau_1=-0.68>-1.95$. The World Industrial Production Index displays a clear upward trend before and after the COVID-19 shock, with a large temporary collapse in early 2020. Because the level of the series grows over time and the variance appears to increase with the level, the process is unlikely to be stationary in levels. To obtain a stationary transformation suitable for ARIMA/ADL modelling, the series should be log-transformed and then differenced. Log-differencing stabilizes the variance and converts growth in the index into approximate percentage changes, producing a series that fluctuates around a stable mean without a deterministic trend.

InΒ [320]:
WIP_log <- log(WIP)
dWIP_log = diff(WIP_log)

# diff() always produces an NA in the first observation (there is no lagged value for t=1).
dWIP_log  <- na.omit(dWIP_log)
InΒ [321]:
summary(ur.df(dWIP_log, type='none', lags=30, selectlags="AIC"))
############################################### 
# Augmented Dickey-Fuller Test Unit Root Test # 
############################################### 

Test regression none 


Call:
lm(formula = z.diff ~ z.lag.1 - 1 + z.diff.lag)

Residuals:
       Min         1Q     Median         3Q        Max 
-0.0023731 -0.0004836  0.0000248  0.0004469  0.0033737 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
z.lag.1      -0.519391   0.152078  -3.415 0.000741 ***
z.diff.lag1  -0.346175   0.153434  -2.256 0.024907 *  
z.diff.lag2  -0.204046   0.148273  -1.376 0.169982    
z.diff.lag3  -0.009196   0.142813  -0.064 0.948709    
z.diff.lag4  -0.065071   0.135697  -0.480 0.631971    
z.diff.lag5  -0.158288   0.127708  -1.239 0.216317    
z.diff.lag6  -0.168202   0.117925  -1.426 0.154993    
z.diff.lag7  -0.281667   0.110352  -2.552 0.011281 *  
z.diff.lag8  -0.230547   0.103720  -2.223 0.027107 *  
z.diff.lag9  -0.207517   0.098251  -2.112 0.035649 *  
z.diff.lag10 -0.146693   0.093030  -1.577 0.116074    
z.diff.lag11 -0.189948   0.081234  -2.338 0.020146 *  
z.diff.lag12  0.171498   0.061588   2.785 0.005761 ** 
---
Signif. codes:  0 β€˜***’ 0.001 β€˜**’ 0.01 β€˜*’ 0.05 β€˜.’ 0.1 β€˜ ’ 1

Residual standard error: 0.0008643 on 255 degrees of freedom
Multiple R-squared:  0.6029,	Adjusted R-squared:  0.5826 
F-statistic: 29.78 on 13 and 255 DF,  p-value: < 2.2e-16


Value of test-statistic is: -3.4153 

Critical values for test statistics: 
      1pct  5pct 10pct
tau1 -2.58 -1.95 -1.62

The transformed series is now stationary since we reject the null hypothesis of the presence of a unit root because $\tau_1=-3.42<-1.95$.

InΒ [335]:
options(repr.plot.width=30, repr.plot.height=20)

fig = autoplot(dWIP_log, size = .8, colour = 'black')
fig = fig +
theme_light() +
theme(plot.margin = ggplot2::margin(1, 1, 1, 1, "cm")) +
theme(text=element_text(size=30)) +
labs(x = "Month -  Year") +
labs(y = "Transformed World Industrial Production index")
fig
No description has been provided for this image

OPEC Production (million b/d)ΒΆ

InΒ [323]:
summary(ur.df(OPEC, type='trend', lags=30, selectlags="AIC"))
############################################### 
# Augmented Dickey-Fuller Test Unit Root Test # 
############################################### 

Test regression trend 


Call:
lm(formula = z.diff ~ z.lag.1 + 1 + tt + z.diff.lag)

Residuals:
    Min      1Q  Median      3Q     Max 
-5.7123 -0.2018  0.0145  0.2857  1.9667 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept)  1.9282081  0.5657624   3.408 0.000756 ***
z.lag.1     -0.0620135  0.0192209  -3.226 0.001412 ** 
tt          -0.0004843  0.0004496  -1.077 0.282385    
z.diff.lag1 -0.0200903  0.0597006  -0.337 0.736748    
z.diff.lag2 -0.1208657  0.0596477  -2.026 0.043737 *  
---
Signif. codes:  0 β€˜***’ 0.001 β€˜**’ 0.01 β€˜*’ 0.05 β€˜.’ 0.1 β€˜ ’ 1

Residual standard error: 0.5667 on 264 degrees of freedom
Multiple R-squared:  0.06198,	Adjusted R-squared:  0.04777 
F-statistic: 4.361 on 4 and 264 DF,  p-value: 0.001972


Value of test-statistic is: -3.2264 4.212 6.1307 

Critical values for test statistics: 
      1pct  5pct 10pct
tau3 -3.98 -3.42 -3.13
phi2  6.15  4.71  4.05
phi3  8.34  6.30  5.36

We cannot reject the null hypothesis that the constant and deterministic trend are jointly zero because $\varphi_2=4.21<4.71$.

optional: (We also cannot reject that the deterministic trend is zero because $\varphi_3=6.13<6.30$.)

Therefore we re-run the ADF test with specification = "none" (no constant, no determistic trend).

InΒ [326]:
summary(ur.df(OPEC, type='none', lags=30, selectlags="AIC"))
############################################### 
# Augmented Dickey-Fuller Test Unit Root Test # 
############################################### 

Test regression none 


Call:
lm(formula = z.diff ~ z.lag.1 - 1 + z.diff.lag)

Residuals:
    Min      1Q  Median      3Q     Max 
-5.7916 -0.1894  0.0262  0.2684  2.3263 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)  
z.lag.1      0.0004771  0.0011952   0.399   0.6901  
z.diff.lag1 -0.0315418  0.0603874  -0.522   0.6019  
z.diff.lag2 -0.1315419  0.0603715  -2.179   0.0302 *
---
Signif. codes:  0 β€˜***’ 0.001 β€˜**’ 0.01 β€˜*’ 0.05 β€˜.’ 0.1 β€˜ ’ 1

Residual standard error: 0.5778 on 266 degrees of freedom
Multiple R-squared:  0.01859,	Adjusted R-squared:  0.00752 
F-statistic: 1.679 on 3 and 266 DF,  p-value: 0.1718


Value of test-statistic is: 0.3992 

Critical values for test statistics: 
      1pct  5pct 10pct
tau1 -2.58 -1.95 -1.62

We cannot reject the null that there exists a unit root at the 5pct significance level since $\tau_1=0.3992>-1.95$. OPEC crude oil production is clearly non-stationary in levels. The series exhibits long swings, persistent upward and downward movements, and clear structural breaks β€” most notably the collapse during the 2008–09 recession and the extreme COVID-19 drop in 2020. These level shifts imply that both the mean and the variance change over time, violating weak stationarity. To obtain a transformation suitable for ARIMA or ADL modelling, the series should be log-transformed and then differenced. Log-differencing stabilizes the scale, converts changes in production into approximate percentage changes, and produces a growth-rate series that fluctuates around a constant mean and is therefore much closer to stationarity.

InΒ [330]:
OPEC_log <- log(OPEC)
dOPEC_log = diff(OPEC_log)

# diff() always produces an NA in the first observation (there is no lagged value for t=1).
dOPEC_log  <- na.omit(dOPEC_log)
InΒ [332]:
summary(ur.df(dOPEC_log, type='none', lags=30, selectlags="AIC"))
############################################### 
# Augmented Dickey-Fuller Test Unit Root Test # 
############################################### 

Test regression none 


Call:
lm(formula = z.diff ~ z.lag.1 - 1 + z.diff.lag)

Residuals:
      Min        1Q    Median        3Q       Max 
-0.211770 -0.005906  0.001522  0.009675  0.093641 

Coefficients:
           Estimate Std. Error t value Pr(>|t|)    
z.lag.1    -1.14594    0.08534 -13.428   <2e-16 ***
z.diff.lag  0.14847    0.05997   2.476   0.0139 *  
---
Signif. codes:  0 β€˜***’ 0.001 β€˜**’ 0.01 β€˜*’ 0.05 β€˜.’ 0.1 β€˜ ’ 1

Residual standard error: 0.02068 on 266 degrees of freedom
Multiple R-squared:  0.5101,	Adjusted R-squared:  0.5065 
F-statistic: 138.5 on 2 and 266 DF,  p-value: < 2.2e-16


Value of test-statistic is: -13.4285 

Critical values for test statistics: 
      1pct  5pct 10pct
tau1 -2.58 -1.95 -1.62

The transformed series is now stationary since we reject the null hypothesis of the presence of a unit root because $\tau_1=-13.43<-1.95$.

InΒ [334]:
options(repr.plot.width=30, repr.plot.height=20)

fig = autoplot(dOPEC_log, size = .8, colour = 'black')
fig = fig +
theme_light() +
theme(plot.margin = ggplot2::margin(1, 1, 1, 1, "cm")) +
theme(text=element_text(size=30)) +
labs(x = "Month -  Year") +
labs(y = "Transformed OPEC Production (million b/d)")
fig
No description has been provided for this image

Baltic Dry Index (BADI)ΒΆ

InΒ [337]:
summary(ur.df(BDI, type='trend', lags=30, selectlags="AIC"))
############################################### 
# Augmented Dickey-Fuller Test Unit Root Test # 
############################################### 

Test regression trend 


Call:
lm(formula = z.diff ~ z.lag.1 + 1 + tt + z.diff.lag)

Residuals:
     Min       1Q   Median       3Q      Max 
-2322.23  -235.44   -35.84   215.90  2770.72 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept) 320.662483 125.055729   2.564 0.010901 *  
z.lag.1      -0.068877   0.022234  -3.098 0.002162 ** 
tt           -0.976491   0.527081  -1.853 0.065059 .  
z.diff.lag1   0.222114   0.058754   3.780 0.000194 ***
z.diff.lag2   0.001097   0.060312   0.018 0.985501    
z.diff.lag3   0.088752   0.060353   1.471 0.142610    
z.diff.lag4  -0.246625   0.059765  -4.127 4.95e-05 ***
---
Signif. codes:  0 β€˜***’ 0.001 β€˜**’ 0.01 β€˜*’ 0.05 β€˜.’ 0.1 β€˜ ’ 1

Residual standard error: 595.3 on 262 degrees of freedom
Multiple R-squared:  0.1477,	Adjusted R-squared:  0.1282 
F-statistic: 7.567 on 6 and 262 DF,  p-value: 1.691e-07


Value of test-statistic is: -3.0978 3.2749 4.9123 

Critical values for test statistics: 
      1pct  5pct 10pct
tau3 -3.98 -3.42 -3.13
phi2  6.15  4.71  4.05
phi3  8.34  6.30  5.36

We cannot reject the null hypothesis that the constant and deterministic trend are jointly zero because $\varphi_2=3.27<4.71$.

optional: (We also cannot reject that the deterministic trend is zero because $\varphi_3=4.91<6.30$.)

Therefore we re-run the ADF test with specification = "none" (no constant, no determistic trend).

InΒ [338]:
summary(ur.df(BDI, type='none', lags=30, selectlags="AIC"))
############################################### 
# Augmented Dickey-Fuller Test Unit Root Test # 
############################################### 

Test regression none 


Call:
lm(formula = z.diff ~ z.lag.1 - 1 + z.diff.lag)

Residuals:
     Min       1Q   Median       3Q      Max 
-2479.85  -171.08    51.38   295.51  2641.38 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
z.lag.1     -0.01924    0.01242  -1.550 0.122359    
z.diff.lag1  0.20508    0.05897   3.478 0.000592 ***
z.diff.lag2 -0.02245    0.06024  -0.373 0.709661    
z.diff.lag3  0.06657    0.06038   1.103 0.271212    
z.diff.lag4 -0.27434    0.05943  -4.616 6.11e-06 ***
---
Signif. codes:  0 β€˜***’ 0.001 β€˜**’ 0.01 β€˜*’ 0.05 β€˜.’ 0.1 β€˜ ’ 1

Residual standard error: 601.3 on 264 degrees of freedom
Multiple R-squared:  0.1237,	Adjusted R-squared:  0.1071 
F-statistic: 7.454 on 5 and 264 DF,  p-value: 1.466e-06


Value of test-statistic is: -1.5499 

Critical values for test statistics: 
      1pct  5pct 10pct
tau1 -2.58 -1.95 -1.62

We cannot reject the null that there exists a unit root at the 5pct significance level since $\tau_1=-1.5499>-1.95$. The Baltic Dry Index (BDI) is strongly non-stationary in levels. The series displays large boom–bust cycles (notably the 2003–2008 surge and the collapse that followed), long spells of persistent drift, and clear breaks around major global demand shocks such as 2009 and 2020. Both the mean and the variance change substantially over time, violating the assumptions of weak stationarity. To model the growth rate of shipping activity in a meaningful way, the series should be log-transformed and then differenced. The log-first-difference transformation stabilizes the scale, turns level swings into approximate percentage changes, and produces a series that fluctuates around a stable mean, making it far more suitable for ARIMA, ADL, or VAR analysis.

InΒ [342]:
BDI_log <- log(BDI)
dBDI_log = diff(BDI_log)

# diff() always produces an NA in the first observation (there is no lagged value for t=1).
dBDI_log  <- na.omit(dBDI_log)
InΒ [343]:
summary(ur.df(dBDI_log, type='none', lags=30, selectlags="AIC"))
############################################### 
# Augmented Dickey-Fuller Test Unit Root Test # 
############################################### 

Test regression none 


Call:
lm(formula = z.diff ~ z.lag.1 - 1 + z.diff.lag)

Residuals:
     Min       1Q   Median       3Q      Max 
-1.32804 -0.13536  0.02243  0.15470  1.18551 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
z.lag.1     -1.65427    0.20246  -8.171 1.35e-14 ***
z.diff.lag1  0.63720    0.18389   3.465 0.000620 ***
z.diff.lag2  0.56429    0.16546   3.410 0.000752 ***
z.diff.lag3  0.56560    0.14611   3.871 0.000137 ***
z.diff.lag4  0.36894    0.12696   2.906 0.003976 ** 
z.diff.lag5  0.29367    0.10946   2.683 0.007769 ** 
z.diff.lag6  0.22010    0.08751   2.515 0.012505 *  
z.diff.lag7  0.14624    0.06183   2.365 0.018767 *  
---
Signif. codes:  0 β€˜***’ 0.001 β€˜**’ 0.01 β€˜*’ 0.05 β€˜.’ 0.1 β€˜ ’ 1

Residual standard error: 0.2683 on 260 degrees of freedom
Multiple R-squared:  0.5276,	Adjusted R-squared:  0.5131 
F-statistic:  36.3 on 8 and 260 DF,  p-value: < 2.2e-16


Value of test-statistic is: -8.1709 

Critical values for test statistics: 
      1pct  5pct 10pct
tau1 -2.58 -1.95 -1.62

The transformed series is now stationary since we reject the null hypothesis of the presence of a unit root because $\tau_1=-8.1709<-1.95$.

InΒ [346]:
options(repr.plot.width=30, repr.plot.height=20)

fig = autoplot(dBDI_log, size = .8, colour = 'black')
fig = fig +
theme_light() +
theme(plot.margin = ggplot2::margin(1, 1, 1, 1, "cm")) +
theme(text=element_text(size=30)) +
labs(x = "Month -  Year") +
labs(y = "Transformed Baltic Dry Index (BADI)")
fig
No description has been provided for this image

SVAR modelΒΆ

A VAR model is preferred when analyzing multiple interdependent time series, as it allows for feedback and mutual interactions among variables, unlike the single-equation ADL model.

We tested stationarity since it ensures that relationships between variables remain stable over time. Non-stationary data can lead to spurious results, so differencing or a VECM formulation is needed if variables are cointegrated.